Significance analysis of high-dimensional, low-sample size partially labeled data
Similar resources
Significance Analysis of High-Dimensional, Low-Sample Size Partially Labeled Data
Classification and clustering are both important topics in statistical learning. A natural question herein is whether predefined classes are really different from one another, or whether clusters are really there. Specifically, we may be interested in knowing whether the two classes defined by some class labels (when they are provided), or the two clusters tagged by a clustering algorithm (wher...
Effective Linear Discriminant Analysis for High Dimensional, Low Sample Size Data
In the so-called high dimensional, low sample size (HDLSS) settings, LDA possesses the “data piling” property, that is, it maps all points from the same class in the training data to a common point, and so when viewed along the LDA projection directions, the data are piled up. Data piling indicates overfitting and usually results in poor out-of-sample classification. In this paper, a novel appr...
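The data-piling property described above can be reproduced numerically. The following is a minimal sketch under assumed settings, not the method proposed in the paper: the function name `piling_direction`, the sample sizes, and the synthetic Gaussian data are all hypothetical. It builds a direction from the pseudo-inverse of the centered training matrix, along which every training point of a class projects to the same value, and then compares the within-class spread of training and fresh test projections.

```python
import numpy as np

def piling_direction(X, y):
    """Unit vector along which each training class collapses to one point.

    Assumes more features than samples (HDLSS) and labels y in {0, 1}.
    """
    Xc = X - X.mean(axis=0)                      # center with the global mean
    n1, n0 = np.sum(y == 1), np.sum(y == 0)
    a = np.where(y == 1, 1.0 / n1, -1.0 / n0)    # target projections, sum to zero
    # Minimum-norm solution of Xc @ w = a; rcond drops the zero singular
    # value introduced by centering.
    w = np.linalg.pinv(Xc, rcond=1e-10) @ a
    return w / np.linalg.norm(w)

def within_class_spread(X, y, w):
    """Largest within-class standard deviation of the projections X @ w."""
    proj = X @ w
    return max(proj[y == c].std() for c in (0, 1))

rng = np.random.default_rng(0)
n_per_class, d = 10, 200                         # hypothetical HDLSS sizes
y = np.repeat([0, 1], n_per_class)

def sample():
    X = rng.standard_normal((2 * n_per_class, d))
    X[y == 1, :5] += 0.5                         # small mean shift in 5 features
    return X

X_train, X_test = sample(), sample()
w = piling_direction(X_train, y)

train_spread = within_class_spread(X_train, y, w)
test_spread = within_class_spread(X_test, y, w)
print(f"training spread: {train_spread:.2e}")
print(f"test spread:     {test_spread:.2e}")
```

The training spread should be zero up to floating-point error, while the test projections remain scattered, which is the overfitting symptom the abstract refers to.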
Sparse Linear Discriminant Analysis with Applications to High Dimensional Low Sample Size Data
This paper develops a method for automatically incorporating variable selection in Fisher’s linear discriminant analysis (LDA). Utilizing the connection of Fisher’s LDA and a generalized eigenvalue problem, our approach applies the method of regularization to obtain sparse linear discriminant vectors, where “sparse” means that the discriminant vectors have only a small number of nonzero compone...
Cox model in high dimensional and low sample size settings
The evolution of cancer is more certainly linked to a complex interplay of genes rather than a single gene activity. Multivariate analysis, which can exploit the correlated pattern of gene expression displayed by genes behaving jointly, such as genes performing the same functions or genes operating along the same pathway, can become a very useful diagnostic tool to determine molecular predictor o...
Partition clustering of high dimensional low sample size data based on p-values
This thesis introduces a new partitioning algorithm to cluster variables in high dimensional low sample size (HDLSS) data and high dimensional longitudinal low sample size (HDLLSS) data. HDLSS data contain a large number of variables with a small number of replications per variable, and HDLLSS data refer to HDLSS data observed over time. Clustering techniques play an important role in analyzing h...
Journal
Journal title: Journal of Statistical Planning and Inference
Year: 2016
ISSN: 0378-3758
DOI: 10.1016/j.jspi.2016.03.002